A heterogeneity based iterative clustering approach for obtaining samples with reduced bias
Abstract
Medical and social sciences demand sampling techniques that are robust, reliable, and replicable, and that yield samples with the least bias. Most applications of sampling use randomized sampling, with stratification where applicable to lower the bias. Randomized sampling is not consistent: it may produce a different sample on each run, and those samples may not resemble one another. In this paper, we introduce a novel sampling technique called the Wobbly Center Algorithm, which relies on iterative clustering based on maximizing heterogeneity to obtain samples that are consistent and have low bias. The algorithm works by iteratively building clusters, each time finding the points at maximal distance from the cluster center. It consistently gives better results in lowering the bias, reducing the standard deviation of the means of each feature in scaled data.
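The abstract gives only the principle of the method, so the following is a minimal sketch of one possible reading, not the authors' Wobbly Center Algorithm. The turn-taking scheme, the choice to start every group at the global mean, and the function names `scale`, `farthest_point_groups`, and `spread_of_group_means` are all assumptions made for illustration. The sketch needs at least as many rows as groups.

```python
import numpy as np

def scale(X):
    """Standardize each feature to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant features unscaled instead of dividing by zero
    return (X - X.mean(axis=0)) / std

def farthest_point_groups(X, k):
    """Deterministically split the rows of X into k groups.

    Assumed scheme (not taken from the paper): groups take turns, and on its
    turn a group claims the unassigned point farthest from its current center.
    The center is then recomputed, so it moves after every assignment.
    """
    n = len(X)
    labels = np.full(n, -1)
    sums = np.zeros((k, X.shape[1]))
    counts = np.zeros(k, dtype=int)
    centers = np.tile(X.mean(axis=0), (k, 1))  # assumption: all groups start at the global mean
    for step in range(n):
        g = step % k
        free = np.flatnonzero(labels == -1)
        dist = np.linalg.norm(X[free] - centers[g], axis=1)
        pick = free[np.argmax(dist)]  # argmax takes the first maximum, so ties break the same way every run
        labels[pick] = g
        sums[g] += X[pick]
        counts[g] += 1
        centers[g] = sums[g] / counts[g]
    return labels

def spread_of_group_means(X, labels, k):
    """Per-feature standard deviation of the k group means (lower means more similar groups)."""
    means = np.array([X[labels == g].mean(axis=0) for g in range(k)])
    return means.std(axis=0)

# Hypothetical usage on synthetic data.
rng = np.random.default_rng(0)
X = scale(rng.normal(size=(60, 3)))
labels = farthest_point_groups(X, k=3)
print(spread_of_group_means(X, labels, 3))
```

Because no random draw is involved after the data are fixed, repeated runs on the same input return the same groups, which is the consistency property the abstract emphasizes. Whether this particular reading lowers the spread of group means relative to randomized sampling is not something the sketch establishes.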
Journal: CoRR
Volume: abs/1709.01423
Publication date: 2017